Improved rates for Wasserstein deconvolution with ordinary smooth error in dimension one
This paper deals with the estimation of a probability measure on the real
line from data observed with an additive noise. We are interested in rates of
convergence for the Wasserstein metric of order p ≥ 1. The distribution of
the errors is assumed to be known and to belong to a class of supersmooth or
ordinary smooth distributions. We obtain in the univariate situation an
improved upper bound in the ordinary smooth case and less restrictive
conditions for the existing bound in the supersmooth one. In the ordinary
smooth case, a lower bound is also provided, and numerical experiments
illustrating the rates of convergence are presented.
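As a small illustration of the metric discussed in this abstract (not of the deconvolution estimator itself), the following sketch computes the empirical Wasserstein distance of order p between two samples on the real line. It relies on the standard fact that, in dimension one, the optimal coupling matches sorted values. The function name `wasserstein_1d` and the equal-sample-size restriction are assumptions made here for brevity.

```python
import numpy as np


def wasserstein_1d(x, y, p=1.0):
    """Empirical Wasserstein distance of order p between two samples.

    Assumes both samples live on the real line and have the same size,
    so the optimal coupling simply pairs the sorted values.
    """
    if p < 1:
        raise ValueError("the order p must be at least 1")
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # Average the p-th power of the gaps between order statistics,
    # then take the p-th root.
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))
```

For instance, shifting a sample by a constant c gives a distance of |c| for every order p, which is a convenient sanity check.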
Statistical learning for wind power : a modeling and stability study towards forecasting
We focus on wind power modeling using machine learning techniques. We show on
real data provided by the wind energy company Maïa Eolis that parametric
models, even those closely following the physical equation relating wind
production to wind speed, are outperformed by intelligent learning algorithms.
In particular, the CART-Bagging algorithm gives very stable and promising
results. Besides, as a step towards forecasting, we quantify the impact of
using degraded wind measurements on performance. We also show, on this
application, that the default methodology for selecting a subset of predictors
provided in the standard random forest package can be refined, especially when
one of the predictors has a major impact.
Projection-based curve clustering
This paper focuses on unsupervised curve classification in the context of the nuclear industry. At the Commissariat à l'Energie Atomique (CEA), Cadarache (France), the thermal-hydraulic computer code CATHARE is used to study the reliability of reactor vessels. The code inputs are physical parameters and the outputs are time evolution curves of a few other physical quantities. As the CATHARE code is quite complex and CPU-time consuming, it has to be approximated by a regression model. This regression process involves a clustering step. In the present paper, CATHARE output curves are clustered using a k-means scheme, with a projection onto a lower dimensional space. We study the properties of the empirically optimal cluster centers found by the clustering method based on projections, compared to the “true” ones. The choice of the projection basis is discussed, and an algorithm is implemented to select the best projection basis among a library of orthonormal bases. The approach is illustrated on a simulated example and then applied to the industrial problem.
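The general idea of clustering curves through a projection can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: curves are assumed to be sampled on a common grid, the orthonormal basis is a discrete cosine basis chosen here for convenience, the clustering step is plain Lloyd-type k-means, and the basis-selection algorithm mentioned in the abstract is not reproduced. All function names (`cosine_basis`, `kmeans`, `cluster_curves`) are hypothetical.

```python
import numpy as np


def cosine_basis(n_points, n_funcs):
    """First n_funcs vectors of the orthonormal DCT-II basis (one per row)."""
    t = (np.arange(n_points) + 0.5) / n_points
    basis = np.array([np.cos(np.pi * j * t) for j in range(n_funcs)])
    basis[0] *= np.sqrt(1.0 / n_points)   # normalise the constant vector
    basis[1:] *= np.sqrt(2.0 / n_points)  # normalise the oscillating vectors
    return basis


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns labels and cluster centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute labels so they match the returned centers.
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centers


def cluster_curves(curves, k, n_coefs, seed=0):
    """Project sampled curves onto n_coefs basis vectors, then run k-means.

    curves has shape (n_curves, n_points). The returned centers are mapped
    back to the original grid, with shape (k, n_points).
    """
    basis = cosine_basis(curves.shape[1], n_coefs)
    coefs = curves @ basis.T              # coordinates in the projected space
    labels, centers = kmeans(coefs, k, seed=seed)
    return labels, centers @ basis
```

Working in the coefficient space keeps each k-means iteration cheap when `n_coefs` is much smaller than the number of grid points, at the cost of discarding whatever part of the curves lies outside the span of the retained basis vectors.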
On principal curves with a length constraint
Principal curves are defined as parametric curves passing through the “middle” of a probability distribution in R^d. In addition to the original definition based on self-consistency, several points of view have been considered, among which a least-squares type constrained minimization problem. In this paper, we are interested in theoretical properties satisfied by a constrained principal curve associated to a probability distribution with a finite second-order moment. We study open and closed principal curves f: [0,1] → R^d with length at most L and show in particular that they have finite curvature whenever the probability distribution is not supported on the range of a curve with length L. We derive from the first-order condition, expressing that a curve is a critical point for the criterion, an equation involving the curve, its curvature, as well as a random variable playing the role of the curve parameter. This equation allows us to show that a constrained principal curve in dimension 2 has no multiple points.
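For orientation, the least-squares type constrained problem mentioned in this abstract is commonly written as below. The notation is an assumption made here (X for a random vector with the given distribution, Δ for the criterion, and a script L for the length of a curve); the abstract itself does not display the formula.

```latex
\[
  \min_{f}\; \Delta(f)
  \quad\text{subject to}\quad \mathscr{L}(f) \le L,
  \qquad\text{where}\qquad
  \Delta(f) = \mathbb{E}\Bigl[\,\min_{t \in [0,1]} \lVert X - f(t) \rVert^{2}\Bigr],
\]
% f ranges over curves f : [0,1] -> R^d, and \mathscr{L}(f) denotes the length of f.
```

The second-order moment assumption guarantees that Δ(f) is finite for every curve of finite length.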
Estimation via length-constrained generalized empirical principal curves under small noise
In this paper, we propose a method to build a sequence of generalized empirical principal curves, with selected length, so that, in Hausdorff distance, the images of the estimating principal curves converge in probability to the image of g.
On two extensions of the vector quantization scheme